Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis

Authors

Hideharu Nakajima

Hideyuki Mizuno

Sumitaka Sakauchi

Abstract

Expressive text-to-speech synthesis requires both text processing and the rendering of natural expressive speech. This paper focuses on the former, a front-end task in the production of synthetic speech, and investigates a novel method for predicting emphasized accent phrases from advertisement text. To this end, we examine features that can be extracted accurately by text processing based on current text-to-speech synthesis technologies. Mutual information computed between the emphasis label and features of Japanese advertisement sentences shows that, among these features, the surface strings of the main content and function words and the part of speech of the main function words in an accent phrase have higher potential for predicting whether that accent phrase should be emphasized. Experiments confirm that emphasized accent phrase prediction using a support vector machine (SVM) achieves encouraging accuracy for systems that require emphasized accent phrase locations as context information to improve speech synthesis quality.
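
The feature ranking described in the abstract relies on mutual information between an emphasis label and a feature of each accent phrase. The following is a minimal sketch of that calculation using empirical counts, not the authors' implementation. The function name `mutual_information`, the choice of bits as the unit, and the toy data (a hypothetical "main function word" feature with a binary emphasis label) are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Estimate I(F; E) in bits from paired samples using empirical counts.

    features: one categorical value per accent phrase (assumed feature).
    labels:   one emphasis label per accent phrase (e.g. 1 = emphasized).
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    n = len(features)
    if n == 0:
        raise ValueError("at least one sample is required")

    joint = Counter(zip(features, labels))  # counts of (feature, label) pairs
    f_counts = Counter(features)            # marginal counts of feature values
    l_counts = Counter(labels)              # marginal counts of labels

    mi = 0.0
    for (f, l), c in joint.items():
        # p(f,l) * log2(p(f,l) / (p(f) * p(l))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (f_counts[f] * l_counts[l]))
    return mi


# Toy, made-up data: one entry per accent phrase (not from the paper).
main_function_word = ["ga", "wo", "ga", "ni", "wo", "ga"]
emphasized = [1, 0, 1, 0, 0, 1]

score = mutual_information(main_function_word, emphasized)
print(f"I(feature; emphasis) = {score:.3f} bits")
```

A feature with a higher score carries more information about the emphasis label, so scores computed this way for each candidate feature can be compared to decide which ones to pass to a classifier such as an SVM.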

Similar resources

Prosody Prediction from Linguistically Enriched Documents Based on a Machine Learning Approach

One of the main aspects in text-to-speech synthesis is the successful prediction of prosodic events. In this work we deal with the prediction of prosodic phrase breaks, accent tones and boundary tones from a linguistically XML-based enriched input (SOLE-ML) produced by a Natural Language Generator (NLG) system. We first extended the original specification of SOLE-ML in order for the NLG to prod...

Which resemblance is useful to predict phrase boundary rise labels for Japanese expressive text-to-speech synthesis, numerically-expressed stylistic or distribution-based semantic?

To establish expressive text-to-speech synthesis, current research studies both the processing of input text and the rendering of natural expressive speech. Focusing on the former as a front-end task in the production of synthetic speech, this paper investigates a novel feature for predicting phrase boundary tone labels which transcribe local fundamental frequency (F0) changes frequently appear...

Corpus-based Generation of F0 Contours Using Generation Process Model for Emotional Speech Synthesis

A corpus-based method was developed for generating fundamental frequency contours in emotional speech synthesis. The method assumes the generation process model and predicts its command parameters (positions and amplitudes) using binary regression trees with the input of linguistic information of the sentence to be synthesized. Because of the model constraint, a certain quality is still kept in...

Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields

When synthesizing speech from Japanese text, correct assignment of accent nuclei for input text with arbitrary contents is indispensable in obtaining naturally-sounding synthetic speech. A phenomenon called accent sandhi occurs in utterances of Japanese; when a word is uttered in a sentence, its accent nucleus may change depending on the contexts of preceding/succeeding words. This paper descri...

Focus And Accent In A Dutch Text-To-Speech System

In this paper we discuss an algorithm for the assignment of pitch accent positions in text-to-speech conversion. The algorithm is closely modeled on current linguistic accounts of accent placement, and assumes a surface syntactic analysis of the input. It comprises a small number of heuristic rules for determining which phrases of a sentence are to be focussed upon; the exact location of a pitc...

Publication date: 2014
